A database shard is a horizontal partition of data in a database or search engine. Each individual partition is referred to as a shard or database shard. Each shard is held on a separate database server instance, to spread load.

Some data within a database remains present in all shards,〔Typically "supporting" data such as dimension tables〕 but some appears only in a single shard. Each shard (or server) acts as the ''single'' source for this subset of data.

== Database architecture ==

Horizontal partitioning is a database design principle whereby ''rows'' of a database table are held separately, rather than being split into columns (which is what normalization and vertical partitioning do, to differing extents). Each partition forms part of a shard, which may in turn be located on a separate database server or physical location.

The horizontal partitioning approach has several advantages. Because the tables are divided and distributed across multiple servers, the total number of rows in each table in each database is reduced. This reduces index size, which generally improves search performance. A database shard can be placed on separate hardware, and multiple shards can be placed on multiple machines. This allows the database to be distributed over a large number of machines, so that the load is spread across them, which can improve performance. In addition, if the database shard is based on some real-world segmentation of the data (e.g., European customers versus American customers), then it may be possible to infer the appropriate shard membership easily and automatically, and query only the relevant shard.〔Cited web source: title "Shard - A Database Design"〕

Disadvantages include:
* A heavier reliance on the interconnect between servers
* Increased latency when querying, especially where more than one shard must be searched
* Data or indexes are often only sharded one way, so that some searches are optimal, and others are slow or impossible
* Issues of consistency and durability due to the more complex failure modes of a set of servers, which often result in systems making no guarantees about cross-shard consistency or durability

In practice, sharding is complex. Although it has long been done by hand-coding (especially where rows have an obvious grouping, as in the customer example above), this is often inflexible. There is a desire to support sharding automatically, both by adding code support for it and by identifying candidates to be sharded separately. Consistent hashing is one form of automatic sharding to spread large loads across multiple smaller services and servers.〔Cited web source: first=Eric〕

Where distributed computing is used to separate load between multiple servers (either for performance or reliability reasons), a shard approach may also be useful.

Source: Wikipedia, the free encyclopedia, "Shard (database architecture)".
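
As an illustration of the consistent hashing mentioned above, the following is a minimal sketch of how a row key might be routed to a single shard, so that only the relevant shard is queried. It is not taken from the article's sources. The class name `ConsistentHashRing`, the shard names, the `customer-N` keys, the choice of MD5, and the figure of 100 virtual points per shard are all assumptions made for the example.

```python
import bisect
import hashlib


def _hash(value: str) -> int:
    """Map a string to a position on the ring (a stable integer from MD5)."""
    return int(hashlib.md5(value.encode("utf-8")).hexdigest(), 16)


class ConsistentHashRing:
    """Hypothetical router that maps a row key to the name of one shard."""

    def __init__(self, shards, replicas=100):
        self._replicas = replicas  # virtual points per shard (assumed value)
        self._points = []          # sorted ring positions
        self._owner = {}           # ring position -> shard name
        for shard in shards:
            self.add_shard(shard)

    def add_shard(self, shard: str) -> None:
        # Place several virtual points per shard to even out the key spread.
        for i in range(self._replicas):
            point = _hash(f"{shard}#{i}")
            self._owner[point] = shard
            bisect.insort(self._points, point)

    def shard_for(self, key: str) -> str:
        # The owner is the first ring position clockwise from the key's hash,
        # wrapping around to the start of the ring when necessary.
        idx = bisect.bisect_right(self._points, _hash(key)) % len(self._points)
        return self._owner[self._points[idx]]


# Illustrative shard names and keys (assumptions, not from the article).
ring = ConsistentHashRing(["shard-a", "shard-b", "shard-c"])
keys = [f"customer-{n}" for n in range(1000)]
before = {k: ring.shard_for(k) for k in keys}

# Adding a shard reassigns only the keys that now fall to the new shard.
ring.add_shard("shard-d")
after = {k: ring.shard_for(k) for k in keys}
moved = [k for k in keys if before[k] != after[k]]
```

In this sketch, every key in `moved` ends up on `shard-d` and all other keys stay where they were. A plain `hash(key) % number_of_shards` scheme lacks this property, since changing the shard count would reassign most keys.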